An API for Transparent Distributed Vertical Data Mining

Authors

  • Masum Serazi
  • Amal Perera
  • Taufik Abidin
  • George Hamer
  • William Perrizo
Abstract

New tools and algorithms give vertical data mining communities scalable and efficient ways to discover hidden nuggets in huge data repositories. Most traditional data mining algorithms do not scale to these huge datasets because the computational resources currently available on a single machine are insufficient for running these applications. Distributed computing based on a vertical data structure presents an effective solution for dealing with the computational load incurred in data mining applications. However, no API currently exists for a vertical data structure in a distributed environment. As a result, an API for distributed vertical data mining is necessary to steer the future development of data mining algorithms and to provide further scalability for large datasets. In this paper we present an API for a distributed vertical data mining environment that provides different levels of transparency and flexibility, using MPI as the distributed computing environment and C++ as the programming language.
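To make the idea of a vertical data structure concrete, the following is a minimal C++ sketch under stated assumptions. It is not the API presented in the paper: the names `BitSlice`, `to_vertical`, and `count_equal` are hypothetical, and the layout (one bit vector per bit position of an 8-bit attribute) is just one simple way to store a column vertically. It shows how a count query can be answered by combining bit slices rather than scanning rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertical layout: slices[b][r] is true when bit b of row r is set.
using BitSlice = std::vector<bool>;

// Decompose one attribute column into `bits` vertical bit slices.
std::vector<BitSlice> to_vertical(const std::vector<std::uint8_t>& column,
                                  int bits = 8) {
    std::vector<BitSlice> slices(bits, BitSlice(column.size(), false));
    for (std::size_t r = 0; r < column.size(); ++r) {
        for (int b = 0; b < bits; ++b) {
            slices[b][r] = ((column[r] >> b) & 1u) != 0;
        }
    }
    return slices;
}

// Count rows whose value equals `target` by combining the slices.
// Assumes `target` fits in the number of bits that were sliced.
std::size_t count_equal(const std::vector<BitSlice>& slices,
                        std::uint8_t target) {
    assert(!slices.empty());
    const std::size_t rows = slices[0].size();
    BitSlice match(rows, true);
    for (std::size_t b = 0; b < slices.size(); ++b) {
        const bool want = ((target >> b) & 1u) != 0;
        for (std::size_t r = 0; r < rows; ++r) {
            // A row stays a match only while every slice agrees with target.
            match[r] = match[r] && (slices[b][r] == want);
        }
    }
    std::size_t count = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        if (match[r]) ++count;
    }
    return count;
}
```

In a distributed setting, one plausible arrangement (again an assumption, not the paper's design) is for each MPI process to hold the slices for its own partition of rows, call `count_equal` locally, and sum the partial counts with a reduction such as `MPI_Allreduce`.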

Similar resources

Computational Fluid Dynamics Simulation and Experimental Validation of Hydraulic Performance of a Vertical Suspended API Pump (RESEARCH NOTE)

For a long period of time, the design and manufacturing technology for high-flow-rate vertically suspended pumps (VSPs), which have extensive applications in many industries such as water and wastewater, mining, petrochemical, and oil and gas, was imported from European countries. For the first time in Iran's pump industry, with the support of Ministry of Petrochemical and ...

The WebSocket API as supporting technology for distributed and agent-driven data mining

Supporting technologies play an important role in distributed data mining systems. The flexibility and scalability of infrastructures and architectures can often determine the strength of a distributed data mining framework. In this paper we present some preliminary research work on a prototype for a distributed data mining framework. We shall show how the WebSocket API, which is a draft sp...

Secure Association Rule Mining for Distributed Level Hierarchy in Web

Data mining technology can analyze massive data and plays a very important role in many domains, but if used improperly it can also cause new information security problems. Thus several privacy-preserving techniques for association rule mining have been proposed in the past few years. Various algorithms have been developed for centralized data, while others refer to distributed data ...

Optimization of Distributed Association Rule Mining Based Partial Vertical Partitioning

Association rule mining is one of the most important techniques in data mining. Data mining is the process of analyzing data from different angles and extracting useful information from it. Modern organizations are geographically distributed. Using traditional centralized association rule mining to discover useful patterns in such distributed systems is not always feasible because merging dat...

An Efficient Data Indexing Approach on Hadoop Using Java Persistence API

Data indexing is common in data mining when working with high-dimensional, large-scale data sets. Hadoop, a cloud computing project using the MapReduce framework in Java, has become of significant interest in distributed data mining. To resolve problems of globalization, random-write and duration in Hadoop, a data indexing approach on Hadoop using the Java Persistence API (JPA) is elaborated in...

Publication date: 2005